Evergreen
planted Dec 27, 2025 · tended May 3, 2026
#moc #ai #agents #autonomous-systems

AI Agents

A map organizing my exploration of AI agents and autonomous systems.

Featured

  • Production LLM Eval Platforms - Multi-agent research synthesis on the production-eval landscape: the eval/observability flywheel, trace data layers, failure-mode discovery, headless agent-driven evals, AI-gateway tracing, and governance.

Research

  • Karpathy Autoresearch - Deep Research Report - Deep research on autonomous AI agents running ML experiments with one GPU per agent. Architecture, multi-agent patterns, the OpenClaw security crisis, and a four-GPU consumer-hardware replication build.
  • Agent Harness Engineering - Synthesis - Synthesis of how to make AI coding agents work reliably. Karpathy's skill tree, Boris Cherny's thread taxonomy, MercadoLibre's four levers at 20K-dev scale, OpenAI Codex team's AGENTS.md pattern.
  • x402 Implementation Guide - Production build journal for an x402 pay-per-call API. Hono + Coinbase CDP facilitator + EIP-3009. Specific package versions, debugging playbook, hardening patterns.
  • x402 Competitive Landscape - Live Services Analysis - Scrape of the x402 ecosystem (~230 services across Bazaar + ecosystem + the402.ai). Where the gaps are, where the slop is, what to build.

What I'm Building

  • claude-autoresearch - Plugin for Claude Code that runs autonomous, milestone-verified research loops.
  • agent-orchestrator - Always-alive daemon for spawning supervised Claude agents from CLAUDE.md harness templates.
  • research-orchestrator - Multi-Claude parallel-research pipeline with shared memory and a synthesizer/judge stage.
  • Autonomous Agent Arena - Three bots running 24/7 on arenabot.io against local Ollama on a four-GPU rig.
  • Infinite Brainstorm - Agent-native infinite canvas. Humans and agents both edit the same board.json.

Getting Started

New to AI agents? Start here:

Core Concepts

Agent Architecture

Tool Integration

  • Tool Use and Function Calling 🌿 - Extending agent capabilities
  • Database access, web search, code execution
  • Tool composition and caching
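
As a rough illustration of the tool composition and caching item above, here is a minimal sketch in plain Python. It is not tied to any particular agent framework: `ToolRegistry`, `fake_search`, and `word_count` are hypothetical names invented for this example, and the cache assumes every registered tool is deterministic and free of side effects.

```python
import json
from typing import Any, Callable, Dict, Tuple


class ToolRegistry:
    """Hypothetical registry: maps tool names to callables and caches results."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}
        self._cache: Dict[Tuple[str, str], Any] = {}
        self.cache_hits = 0

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def call(self, name: str, **kwargs: Any) -> Any:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        # Cache key: tool name plus the arguments serialized in a stable order.
        key = (name, json.dumps(kwargs, sort_keys=True))
        if key in self._cache:
            self.cache_hits += 1
            return self._cache[key]
        result = self._tools[name](**kwargs)
        self._cache[key] = result
        return result


def fake_search(query: str) -> str:
    # Stand-in for a real web search call (assumption for this sketch).
    return f"results for {query}"


def word_count(text: str) -> int:
    return len(text.split())


registry = ToolRegistry()
registry.register("search", fake_search)
registry.register("word_count", word_count)

# Composition: the output of one tool becomes the input of the next.
hits = registry.call("search", query="agent memory")
count = registry.call("word_count", text=hits)
```

Repeating `registry.call("search", query="agent memory")` would be served from the cache. Tools that write to a database or execute code should probably bypass a cache like this one.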

Practical Guides

Framework-Specific

Development Workflow

Production Considerations

Operations

Cost & Performance

  • Token budgets and caching
  • Rate limiting and quotas
  • Latency optimization
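
One common way to approach the rate limiting item above is a token bucket. The sketch below is a generic single-threaded version, not the limiter of any specific provider. The class name, its parameters, and the injectable `clock` argument are assumptions made for illustration.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(
        self,
        capacity: float,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = capacity  # start with a full bucket
        self._last = clock()

    def _refill(self) -> None:
        now = self._clock()
        elapsed = now - self._last
        # Add tokens for the elapsed time, never exceeding capacity.
        self._tokens = min(
            self.capacity, self._tokens + elapsed * self.refill_per_second
        )
        self._last = now

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; return False instead of blocking."""
        self._refill()
        if cost <= self._tokens:
            self._tokens -= cost
            return True
        return False


# Usage: at most 2 requests in a burst, refilled at 1 request per second.
bucket = TokenBucket(capacity=2, refill_per_second=1.0)
first_allowed = bucket.try_acquire()
```

The `cost` argument lets the same bucket meter either request counts or estimated token spend. A caller that gets `False` can wait, queue the request, or fail fast.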

Advanced Topics

Collaboration

  • Multi-Agent Systems 🌿 - Hierarchical, parallel, and democratic patterns
  • Agent communication protocols
  • Conflict resolution

Memory & Learning

  • Agent Memory Systems 🌿 - Vector stores, episodic memory, summarization
  • Long-term knowledge retention
  • Privacy and forgetting

Project Documentation

Experiments

  • eliza-001 - First AI agent experiment: ElizaOS framework exploration
  • eliza-002 - Agent capabilities and architecture deep dive

Case Studies

To be added as I build more agents

Learning Path

Beginner (Weeks 1-2)

  1. Read AI Agents Fundamentals
  2. Try Claude Agent Patterns examples
  3. Build a simple ReAct agent
  4. Understand Tool Use and Function Calling
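
For step 3, the skeleton of a ReAct-style loop (reason, act, observe, repeat) fits in a few lines. The sketch below uses a scripted stand-in for the model so that it runs offline. The `Action: name[arg]` and `Final Answer:` text conventions, along with `react_loop`, `scripted_model`, and `add`, are assumptions made for this example and not a fixed standard.

```python
import re
from typing import Callable, Dict

ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*)\]")
FINAL_RE = re.compile(r"Final Answer:\s*(.*)")


def react_loop(
    model: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> str:
    """Alternate model replies and tool observations until a final answer."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = model(transcript)
        transcript += reply + "\n"
        final = FINAL_RE.search(reply)
        if final:
            return final.group(1).strip()
        action = ACTION_RE.search(reply)
        if action is None:
            transcript += "Observation: could not parse an action\n"
            continue
        name, arg = action.group(1), action.group(2)
        tool = tools.get(name)
        observation = tool(arg) if tool else f"unknown tool {name}"
        # Feed the tool result back so the next model call can use it.
        transcript += f"Observation: {observation}\n"
    raise RuntimeError("step budget exhausted without a final answer")


def scripted_model(transcript: str) -> str:
    # Stand-in for an LLM call: act first, then answer from the observation.
    if "Observation:" not in transcript:
        return "Thought: I need to add the numbers.\nAction: add[2,3]"
    observed = transcript.rsplit("Observation:", 1)[1].strip()
    return f"Thought: the tool returned {observed}.\nFinal Answer: {observed}"


def add(arg: str) -> str:
    a, b = (int(x) for x in arg.split(","))
    return str(a + b)


answer = react_loop(scripted_model, {"add": add}, "What is 2 + 3?")
```

Swapping `scripted_model` for a real model call is the only change needed to try this against an actual LLM. The `max_steps` budget keeps a confused model from looping forever.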

Intermediate (Weeks 3-4)

  1. Study Agent Frameworks Comparison
  2. Implement Agent Memory Systems
  3. Learn Agent Security Considerations
  4. Build a multi-tool agent

Advanced (Week 5+)

  1. Explore Multi-Agent Systems
  2. Master Production Agent Deployment
  3. Set up Agent Evaluation and Testing
  4. Deploy a production agent

Connection Points

External resources: